Workload modeling for parallel computers
Author
Abstract
Good workload models are essential for the design and analysis of parallel computer systems. A workload model can be applied directly in an experimental or simulation environment to verify new scheduling policies or strategies. It can also be used to extrapolate and predict future workload conditions. This work focuses on workload modeling for parallel computers. We start by examining the overall features of the available workloads and find a strong sequential dependency in the submission series of computational jobs. We then propose a new approach based on Markov chains that is capable of describing this temporal dependency. We also analyze the attributes that are missing from some workloads. Our results show that the missing information can still be recovered when the relevant model is trained on another, complete data set. Building on the results of the overall workload analysis, we then inspect workload characteristics at the level of individual users, that is, we analyze in detail how individual users use parallel computers. In particular, we cluster the users into a manageable number of groups, each with distinct features. These groups provide a clear explanation for the global characteristics of the workloads. Afterwards, we examine user feedback and present a novel method to identify it. The evidence indicates that some users behave adaptively and that a complete workload model should not ignore user feedback. The work ends with a brief conclusion on the modeling aspects discussed and an outlook on future work.
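The abstract does not spell out how the Markov chain is constructed, so the following is only a minimal sketch of the general idea, not the author's model. It assumes a first-order chain over a single discretized job attribute (here, the number of processors requested by consecutive submissions). The `trace` values and the helper names `fit_transitions` and `generate` are hypothetical and chosen for illustration.

```python
import random
from collections import Counter, defaultdict


def fit_transitions(states):
    """Estimate first-order transition probabilities from a state sequence."""
    counts = defaultdict(Counter)
    # Count how often each state is immediately followed by each other state.
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def generate(transitions, start, length, seed=0):
    """Sample a synthetic state sequence of at most `length` items."""
    rng = random.Random(seed)
    sequence = [start]
    while len(sequence) < length:
        row = transitions.get(sequence[-1])
        if not row:
            # The last state was never observed with a successor; stop early.
            break
        targets = list(row)
        weights = [row[t] for t in targets]
        sequence.append(rng.choices(targets, weights=weights)[0])
    return sequence


# Hypothetical trace: processors requested by consecutive job submissions.
trace = [1, 1, 1, 8, 8, 1, 1, 64, 64, 64, 8, 1, 1]
P = fit_transitions(trace)
synthetic = generate(P, start=trace[0], length=10)
```

Unlike sampling each job independently from a fixed distribution, the generated sequence tends to repeat the previous request size, which is the kind of sequential dependency the abstract describes.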
Similar resources
Modeling of Parameters in Supercomputer Workloads
Evaluation methods for parallel computers often require the availability of relevant workload information. To this end, workload traces recorded on real installations are frequently used. Alternatively, workload models are applied. However, often not all of the necessary information is available for a specific workload. In this paper, a model is presented to recover an estimated job execution time wh...
A Hybrid Markov Chain Modeling Architecture for Workload on Parallel Computers (Technische Universität Dortmund, Reihe Computational Intelligence, Collaborative Research Center 531 "Design and Management of Complex Technical Processes and Systems by means of Computational Intelligence Methods")
This paper proposes a comprehensive modeling architecture for workloads on parallel computers using Markov chains in combination with state dependent empirical distribution functions. This hybrid approach is based on the requirements of scheduling algorithms: the model considers the four essential job attributes submission time, number of required processors, estimated processing time, and actu...
Parallel Computer Workload Modeling with Markov Chains
In order to evaluate different scheduling strategies for parallel computers, simulations are often executed. As the scheduling quality highly depends on the workload that is served on the parallel machine, a representative workload model is required. Common approaches such as using a probability distribution model can capture the static feature of real workloads, but they do not consider the te...
Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...
A parallel workload model and its implications
We develop a workload model based on the observed behavior of parallel computers at the San Diego Supercomputer Center and the Cornell Theory Center. This model gives us insight into the performance of strategies for scheduling malleable jobs on space-sharing parallel computers. We find that Adaptive Static Partitioning (ASP), which has been reported to work well for other workloads, is inferior ...